Abstract:
The increasing amount of real-time data collected from sensors in industrial environments has accelerated the application of machine learning to decision-making. Reinforcement learning (RL) is a powerful tool for finding optimal policies to achieve a given goal. However, its typical application is risky and insufficient in environments where actions can have irreversible consequences and where interpretability and fairness are required. While new trends in RL may provide guidance based on expert knowledge, they often neither account for uncertainty nor incorporate prior knowledge into the learning process. To address this challenge, we propose a causal reinforcement learning alternative based on Bayesian networks (RLBNs). An RLBN simultaneously models a policy and exploits the joint distribution of the state and action space, reducing uncertainty in unknown situations. We propose a training algorithm for the network’s parameters and structure based on the reward function and the likelihood of the observed effects and measurements. Our experiments with the CartPole benchmark and industrial fouling modelled with ordinary differential equations (ODEs) demonstrate that RLBNs are interpretable, secure, flexible, and more robust than their competitors. Our contributions include a novel method that incorporates expert knowledge into the decision-making engine: it uses Bayesian networks with a predefined structure as a causal graph and a hybrid learning strategy that considers both likelihood and reward, thus preserving the virtues of the Bayesian network.
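As a rough, hypothetical illustration of the hybrid likelihood-plus-reward learning idea summarised above (not the authors' implementation), the sketch below fits the conditional probability tables of a small discrete causal graph S → A → S' by gradient ascent on a combined objective: the log-likelihood of observed transitions plus a weighted expected reward. All names, the toy reward table, and the finite-difference update are assumptions made for illustration only.

```python
# Minimal illustrative sketch (hypothetical, not the paper's code): a discrete
# Bayesian-network-style policy P(A|S) and transition model P(S'|S,A) whose
# parameters are fitted with a hybrid objective mixing data log-likelihood and
# expected reward, in the spirit of the RLBN idea described in the abstract.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 2

# Softmax-parameterised CPTs of the causal graph S -> A -> S'
theta_policy = np.zeros((n_states, n_actions))            # logits for P(A | S)
theta_trans = np.zeros((n_states, n_actions, n_states))   # logits for P(S' | S, A)
reward_table = rng.normal(size=(n_states, n_actions))     # toy reward R(S, A)

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def hybrid_objective(data, lam=0.5):
    """Log-likelihood of observed (s, a, s') triples plus lam * expected reward."""
    p_a = softmax(theta_policy)          # (S, A)
    p_s2 = softmax(theta_trans)          # (S, A, S')
    loglik = sum(np.log(p_a[s, a]) + np.log(p_s2[s, a, s2]) for s, a, s2 in data)
    # Expected immediate reward under the policy, assuming a uniform state distribution.
    expected_reward = (p_a * reward_table).sum()
    return loglik + lam * expected_reward

# Fake batch of transitions; a real application would use discretised
# measurements from the simulator or plant.
data = [(rng.integers(n_states), rng.integers(n_actions), rng.integers(n_states))
        for _ in range(200)]

# Crude finite-difference ascent on the policy logits only, just to show the loop.
eps, lr = 1e-4, 0.1
for _ in range(50):
    grad = np.zeros_like(theta_policy)
    base = hybrid_objective(data)
    for idx in np.ndindex(theta_policy.shape):
        theta_policy[idx] += eps
        grad[idx] = (hybrid_objective(data) - base) / eps
        theta_policy[idx] -= eps
    theta_policy += lr * grad

print("Learned policy P(A|S):\n", softmax(theta_policy).round(3))
```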
Keywords: Reinforcement learning; Bayesian networks; Causality; Parameter learning; Dynamic simulators; Ordinary differential equations
JCR impact factor and WoS quartile: 7.500 - Q1 (2023)
DOI reference: https://doi.org/10.1016/j.engappai.2023.106657
Published in print: October 2023.
Published online: July 2023.
Citation:
G. Valverde, D. Quesada, P. Larrañaga, C. Bielza, Causal reinforcement learning based on Bayesian networks applied to industrial settings. Engineering Applications of Artificial Intelligence. Vol. 125, pp. 106657-1 - 106657-17, October 2023. [Online: July 2023]